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Summary. We investigate optimal choices for the (outer) iteration method to use 
when solving linear systems with Neuberger's overlap operator in QCD. Different for- 
mulations for this operator give rise to different iterative solvers, which are optimal 
for the respective formulation. We compare these methods in theory and practice 
to find the overall optimal one. For the first time, we apply the so-called SUMR 
method of Jagels and Reichel to the shifted unitary version of Neuberger's operator, 
and show that this method is in a sense the optimal choice for propagator compu- 
tations. When solving the "squared" equations in a dynamical simulation with two 
degenerate flavours, it turns out that the CG method should be used. 
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1 Introduction 

Recently, lattice formulations of QCD respecting chiral symmetry have at- 
tracted a lot of attention. A particular promising such formulation, the so- 
called overlap fermions, has been proposed in [9]. From the computational 
point of view, we have to solve linear systems involving the sign function 
sign(Q) of the (hermitian) Wilson fermion matrix Q. These computations are 
very costly, and it is of vital importance to devise efficient numerical schemes. 

A direct computation of sign(Q) is not feasible, since Q is large and sparse, 
whereas sign(Q) would be full. Therefore, numerical algorithms have to follow 
an inner-outer paradigm: One performs an outer Krylov subspace method 
where each iteration requires the computation of a matrix-vector product 
involving sign(Q). Each such product is computed through another, inner 
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iteration using matrix-vector multiplications with Q. In an earlier paper [12] 
we investigated methods for the inner iteration and established the Zolotarev 
rational approximation together with the multishift CG method [5] as the 
method of choice. 

In the present paper we investigate optimal methods for the outer iteration. 
We consider two situations: the case of a propagator computation and the 
case of a pseudofermion computation within a dynamical hybrid Monte Carlo 
simulation where one has to solve the "squared" system. As we will see, the 
optimal method for the case of a propagator computation is a not so well- 
known method due to Jagels and Reichel [8] , whereas in the case of the squared 
system it will be best to apply classical CG on that squared system rather 
than using a two-pass approach. 

This paper is organized as follows: We first introduce our notation in Sec- 
tion 2. We then discuss different equivalent formulations of the Neuberger 
overlap operator and establish useful relations between the eigenpairs of these 
different formulations (Section 3). We discuss optimal Krylov subspace meth- 
ods for the various systems in Section 4 and give some theoretical results on 
their convergence speed based on the spectral information from Section 3. 
In Section 5 we compare the convergence speeds both, theoretically and in 
practical experiments. Our conclusions will be summarized in Section 6. 



2 Notation 

The Wilson-Dirac fermion operator, 

M = 1- nD w , (1) 

represents a nearest neighbour coupling on a four-dimensional space-time lat- 
tice, where the "hopping term" Dw is a non-normal sparse matrix, see (23) 
in the appendix. The coupling parameter k is a real number which is related 
to the bare quark mass. 

The massless overlap operator is defined as 

D = I + M- (MtM)-5. 

For the massive overlap operator, for notational convenience we use a mass 
parameter p > 1 such that this operator is given as 

D = pl + M -(M f M)-$. (2) 

In the appendix, we explain that this form is just a scaled version of Neu- 
berger's original choice, and we relate p to the quark mass, see (27) and (28). 

Replacing M in (2) by its hermitian form Q, see (24), the overlap operator 
can cquivalcntly be written as 



D = pi + 75 sign(Q) = 75 • (pj 5 + sign(Q)), (3) 
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with 75 being denned in (25) and sign(Q) being the standard matrix sign func- 
tion. Note that p-f 5 + sign(Q) is hermitian and indefinite, whereas 75 sign(Q) 
is unitary. 

To reflect these facts in our notation, we define: 

D u = pi + 75 sign(Q), D h = pj 5 + sign(Q), 

where D u — 75-Dfc. Both these operators are normal, i.e. they commute with 
their adjoints. 



3 Formulations and their spectral properties 
3.1 Propagator computations 

When computing quark propagators, the systems to solve are of the form 

D u x = {pi + 75 sign(Q))x = b. (4) 

Multiplying this shifted unitary form by 75, we obtain its hermitian indef- 
inite form as 

D h x = (p7 5 + sign(Q))x = 756. (5) 

The two operators D u and Dh are intimately related. They are both nor- 
mal, and as a consequence the eigenvalues (and the weight with which the 
corresponding eigenvectors appear in the initial residual) solely govern the 
convergence behaviour of an optimal Krylov subspace method used to solve 
the respective equation. 

Very interestingly, the eigenvalues of the different operators can be ex- 
plicitly related to each other. This allows a quite detailed discussion of the 
convergence properties of adequate Krylov subspace solvers. To see this, let 
us introduce an auxiliary decomposition for sign(Q): Using the chiral repre- 
sentation, the matrix 75 on the whole lattice can be represented as a 2 x 2 
block diagonal matrix 

where both diagonal blocks / and —I are of the same size. Partitioning sign(Q) 
correspondingly gives 

sign(Q)=(^). (7) 

In Lemma 1 below we give a convenient decomposition for this matrix that is 
closely related to the so-called CS decomposition (see [6, Theorem 2.6.3] and 
the references therein), an important tool in matrix analysis. Actually, the 
lemma may be regarded as a variant of the CS decomposition for hermitian 
matrices where the decomposition here can be achieved using a similarity 
transform. The proof follows the same lines as the proof for the existence of 
the CS decomposition given in [11]. 
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Lemma 1. There exists a unitary matrix X such that 

sign(Q) = x(^)xt, with X=^^y 

The matrices )P and E are real and diagonal with diagonal elements <j>j , ipj 
and Oj > 0, respectively. Furthermore, 4% + a'j = tpj + a 2 - = 1 and 

4>i = ~i>3 if o-j > (8) 

&,^e{-i,+i} if o~j = o. (9) 

Note that in the case of (8) we know that <f)j and ipj have opposite signs, 
whereas in case (9) we might have cfij = ipj = 1 or cf>j = ipj = — 1. The key point 
for X in Lemma 1 is the fact that 75 A = X75. This allows us to relate the 
eigenvalues and -vectors of the different formulations for the overlap operator 
via 4>j , ipj and aj . In this manner we give results complementary to Edwards et 
al. [2], where relations between the eigenvalues (and partly the -vectors) of the 
different formulations for the overlap operator were given without connecting 
them to sign(Q) via (f>j, ipj and aj. The following lemma gives expressions for 
the eigenvectors and -values of the shifted unitary operator D u . 

Lemma 2. With the notation from Lemma 1, letx^ andx^j be the j-th column 



of Xi and X 2 , respectively. Then spec(D tl ) = {A"±} with 



\l ± =p + ^±i^j\-<i>) if aj^O 

X l+ =P + fai X l- =P-^3 ^ a 3 = °- 
The corresponding eigenvectors are 

*1 ■ ( : 'P ) *f ^0 



(10) 



Proof. With the decomposition for sign(Q) in Lemma 1, we find, using 

At 75 = 75 At, 



At 75 sign(Q)X = 75 At sign(Q)X 



Since the actions of X and X^ represent a similarity transform, we can easily 
derive the eigenvalues and eigenvectors of 75 sign(Q) from the matrix on the 
right. This can be accomplished by noticing that this matrix can be permuted 
to have a block diagonal form with 2x2 blocks on the diagonal, so that the 
problem reduces to the straightforward computation of the eigenvalues and 
eigenvectors of 2 x 2 matrices. This concludes the proof. 
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As a side remark, we mention that Edwards et al. [2] observed that the 
eigenvalues of D u can be efficiently computed by exploiting the fact that most 
of the eigenvalues come in complex conjugate pairs. Indeed, using Lemma 1 
and the fact that -f 5 X — Xj 5 we see that we only have to compute the 
eigenvalues of the hermitian matrix 5n (and to check for one or minus one 
eigenvalues in 5*22)- 

With the same technique as in Lemma 2 we can also find expressions for 
the eigenvalues and eigenvectors of the hermitian indefinite formulation. 

Lemma 3. With the same notation as in Lemma 2, we have that spec(Dh) = 
{A£±} with 



=p + <j>j, = -p + ipj if o-j=0. 
The corresponding eigenvectors are 

( {<t>) P + \^ , 

z j,± = \ n 15. „i 1 */ o'j^o, 




4+ = \ : )■ = I a i -/ = () - 

As an illustration to the results of this section, we performed numerical 
computations for two sample configurations. Both are on a 4 4 lattice with = 
6.0 (configuration A) and — 5.0 (configuration B) respectively. 4 Figure 1 
shows plots of the eigenvalues of M for these configurations. We used k = 
0.2129 for configuration A, k = 0.2809 for configuration B. 

Figure 2 gives plots of the eigenvalues of D u and for our example 
configurations with p = 1.01. It illustrates some interesting consequences of 
Lemma 2 and Lemma 3: With C(p, 1) denoting the circle in the complex plane 
with radius 1 centered at p, we have 

spec(D u ) C C(p,l), (11) 

and, moreover, spec(D u ) is symmetric w.r.t. the real axis. On the other hand, 
spec(Dh) is "almost symmetric" w.r.t. the origin, the only exceptions corre- 
sponding to the case aj = 0, where an eigenvalue of the form p + with 
4>j G {±1} not necessarily matches —p + ipj with ipj £ {±1}, since we do not 
necessarily have ipj — —<fij- Moreover, 

spec(A0 C [-(p + 1), -(p - 1)] U [(p - 1), (p + 1)]. (12) 

4 The hopping matrices for both configurations are available at www.math.uni- 
wuppertal.de/org/SciComp/preprints.html as well as matlab code for all meth- 
ods presented here. The configurations are also available at Matrix Market as 
conf6.0_0014x4.2000.mtx and conf 5 . 0_0014x4 . 2600 .mtx respectively. 



6 G. Arnold et al. 




Fig. 2. Eigenvalues of D u (left column) and Dh (right column) all for p = 1.01. The 
upper plots are for configuration A, the lower for configuration B. 
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Finally, let us note that we have some aj equal to zero as soon as is an eigen- 
value of the massless operator 75 + sign(Q), i.e., as soon as the configuration 
has non-trivial topology. 

3.2 Dynamical simulations 

In a simulation with dynamical fermions, the costly computational task is 
the inclusion of the fermionic part of the action into the "force" evolving the 
gauge fields. This requires to solve the "squared" system 

D\D u x = b ^> D 2 h x = b. 

We will denote D n the respective operator, i.e., 

D n = D 2 = DlD u . 

If we plug in the definition of D u , we find that 

D n y = DlD u y = ((p 2 + 1)1 + p( 75 sign(Q) + sign(Q) 75 )) y = b, (13) 

from which we get the interesting relation 

D u Dl = l5 DlD ul5 = D\D U . (14) 

As is well known, (13) becomes block diagonal, since, using (6) and (7), we 
have 

Dn ~\ (p 2 + l)I-2 P S 22 ) y - b - (15) 

In practical simulations, the decoupled structure of (15) can be exploited 
by running two seperate CG processes simultaneously, one for each (half- 
size) block. Since each of these CG processes only has to accomodate a part 
of the spectrum of D ni convergence will be faster. An alternative usage of 
the block structure is to compute the action of the matrix sign function on a 
two-dimensional subspace corrresponding to the two blocks and to use a block 
type Krylov subspace method. We do not, however, pursue this aspect further, 
here. As another observation, let us note that if 756 = ±6, then computing 
D n b requires only one evaluation of the sign function instead of two ("chiral 
projection approach"). 

As with the other formulations, let us summarize the important spectral 
properties of the squared operator D n . 

Lemma 4. With the notation of Lemma 1, spec(D„) = {A™ ± } with 

A™ ± = 1 + 2<f> jp + p 2 (double eigenvalue) if aj ^ 
= (p + <t>j) 2 , A j; _ = (-p + ^) 2 if o-j = 0. 
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The corresponding eigenvectors are the same as for D h or, equivalently, the 
same as for D u or, again equivalently, 

Notice also that spec(D„) satisfies 

spec(D„) C [(p-l) 2 ,(p+l) 2 ]. (16) 

4 Optimal Krylov subspace methods 
4.1 Propagator computation 

Let us start with the non-hermitian matrix D u . Due to its shifted unitary form, 
there exists an optimal Krylov subspace method based on short recurrences to 
solve (4) . This method was published in [8] and we would like to term it SUMR 
(shifted unitary minimal residual). SUMR is mathematically equivalent to full 
GMRES, so its residuals r m = b — D u x m at iteration m are minimal in the 
2-norm in the corresponding affine Krylov subspace x° + K m (D u ,r°) where 
K m (D u ,r°) = span{r°, D u r°, . . . , _D" l_1 r }. From the algorithmic point of 
view, SUMR is superior to full GMRES, since it relics on short recurrences and 
therefore requires constant storage and an equal amount of arithmetic work 
(one matrix vector multiplication and some vector operations) per iteration. 
The basic idea of SUMR is the observation that the upper Hessenberg matrix 
which describes the recursions of the Arnoldi process is shifted unitary so that 
its representation as a product of Givens rotations can be updated easily and 
with short recurrences. For the full algorithmic description we refer to [8]. 

Based on the spectral properties of D u that we identified in Section 3, we 
can derive the following result on the convergence of SUMR for (4). 

Lemma 5. Let x k be the k-th iterate of SUMR applied to (4) and let r k be 
its residual. Then the following estimate holds: 

||r;i| 2 <2.Q) ||rS|| 2 . (17) 

Proof. Since D u is normal, its field of values F(D U ) = {(D u x,x), \\x\\2 = 1} 
is the convex hull of its eigenvalues so that we have F(D U ) C C(p,l), the 
disk centered at p with radius 1. A standard result for full GMRES (which is 
mathematically equivalent to SUMR) now gives (17), see, e.g., [7]. 

Let us proceed with the hcrmitian operator Dh from (5), which is highly 
indefinite. The MINRES method is the Krylov subspace method of choice for 
such systems: It relies on short recurrences and it produces optimal iterates 
x k in the sense that their residuals r\ = j^b — DhX k are minimal in the 2-norm 
over the affine subspace x° + K m (Di l , r°). 
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Lemma 6. Let x k be the iterate of MINRES applied to (5) at iteration k and 
let r\ be its residual. Then the following estimate holds: 

/ I \ L^/2J 

||^||2<2.^-J ||rg|| 2 . (18) 

Here, [k/2\ means k/2 rounded downwards to the nearest integer. 

Proof. (18) is the standard MINRES estimate (see [7]) with respect to the 
information from (12). 

As a last approach to solving the propagator equation, let us consider the 
standard normal equation 

D u D\z = b,x = D\z. (19) 

Note that there exists an implementation of the CG method for (19) known 
as CGNE [6] which computes x k = D^z k and its residual with respect to (4) 
on the fly, i.e., without additional work. 



4.2 Dynamical simulations 

We now turn to the squared equation (13). Since D n is hermitian and positive 
definite, the CG method is the method of choice for the solution of (13), its 
iterates y m achieving minimal error in the energy norm (see Lemma 7 below) 
over the affinc Krylov subspacc y° + K m (D n , r°). 

Lemma 7. Let y k be the iterate of CG applied to (13) at stage k. Then the 
following estimates hold (y* — D~ x b) 

\\y k -y*\\D n < 2 -Q) \\y -v*\\D n , (20) 

\\y k -y*h<2-^±\(^ k \\y -y*\\ 2 . (21) 

Here, |j • ||£> n denotes the energy norm \\y\\D n = \J y^D n y. 

Proof. The energy norm estimate (20) is the standard estimate for the CG 
method based on the bound cond(D„) < ((p + l)/(p— l)) 2 for the condition 
number of D n . The 2-norm estimate follows from the energy norm estimate 
using 

Wvh < \/\\Dn%- \\y\\ Dn < VTOb • hh 

with \\D-%<l/(p-lf, \\D n \\ 2 <{p+lf. 
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5 Comparison of methods 

Based on Lemma 5 to 7 we now proceed by theoretically investigating the 
work for each of the three methods proposed so far. We consider two tasks: 
A propagator computation where we compute the solution x from (4) or (5), 
and a dynamical simulation where we need to solve (13). 



5.1 Propagator computation 

The methods to be considered are SUMR for (4), MINRES for (5) and CGNE 
for (19). Note that due to (14), Lemma 7 can immediately also be applied to 
the CGNE iterates z k which approximate the solution z of (19). In addition, 
expressing (20) in terms of x k = D^z k instead of z k turns energy norms into 
2-norms, i.e. we have 

\\x k -x*\\ 2 <2-(^j H^-^lb. (22) 

In order to produce a reasonably fair account of how many iterations we 
need, we fix a given accuracy e for the final error and calculate the first 
iteration k{e) for which the results given in Lemmas 5 and 6 and in (22) 
guarantee 

II x k{e) - x* || 2 < £• || r° |j 2 , x* solution of (4), 

where r° — b — D u x°, x° being an identical starting vector for all three meth- 
ods (most likely, x° = 0). Since k{e) will also depend on p, let us write 
fc(e, p). The following may then be deduced from Lemma 5 and 6 and (22) in 
a straightforward manner. 

Lemma 8. (i) For SUMR we have 

II x k -x* h<~\\ r k u || 2 <_1_(I) .|| r ° || 2 , 
and therefore 

fc (e,p)<^W + - ln(2/( ^ 1)) 



(ii) For MINRES we have using ||r°|| = ||r°||, since r° = 75 r° 



Hp) Hp) 

r °h\\ = \K\\, since i h 



* k II2 < — — T II r k \\ 2 < "At ' f^V 2J ■ II r °h Ih. 



and therefore 



fc(£ ,p)<2-(^W + - ln(2/(p - 1)) 



Hp) Hp) 
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(in) For CGNE we have 



k 



y - (-] ■ \\x°-x* || 2 <2- ■— i T . || , 2 . 



and therefore 

k(e,p) < 



\n(e) , -ln(2/(p-l)) 



ln(p) ln(p) 



The arithmetic work in all these iterative methods is completely dominated 
by the cost for evaluating the matrix vector product sign(Q)w. MINRES and 
SUMR require one such evaluation per iteration, whereas CGNE requires two. 
Taking this into account, Lemma 8 suggests that MINRES and CGNE should 
require about the same work to achieve a given accuracy s, whereas SUMR 
should need only half as much work, thus giving a preference for SUMR. 
Of course, such conclusions have to be taken very carefully: In a practical 
computation, the progress of the iteration will depend on the distribution of 
the eigenvalues, whereas the numbers k(e,p) of Lemma 1 were obtained by 
just using bounds for the eigenvalues. For large values of p the theoretical 
factor two between SUMR and MINRES/CGNE on the other hand, can be 
understood heuristically by the observation that SUMR can already reduce 
the residual significantly by placing one root of its corresponding polynomial in 
the center of the disc C(p, 1) whereas for the hcrmitian indefinite formulation 
two roots are necessary in the two separate intervals. For smaller values of p 
the differences are expected to be smaller except for eigenvalues of D u close 
to p+ 1. 

Figure 3 plots convergence diagrams for all three methods for our example 
configurations. The diagrams plot the relative norm of the residual ||r' fe ||/||7 ,0 || 
as a function of the total number of matrix vector multiplications with the 
matrix Q. These matrix vector multiplications represent the work for evaluat- 
ing sign(Q)v in each iterative step, since we use the multishift CG method on 
a Zolotarev approximation with an accuracy of 10 -8 with 10 poles (configura- 
tion A) and 20 poles (configuration B) respectively to approximate sign(Q)w. 
The true residual (dotted) converges to the accuracy of the inner iteration. 
Note that the computations for configuration B arc much more demanding, 
since the evaluation of sign(Q) -v is more costly. We did not use any projection 
techniques to speed up this part of the computation. 

Figure 4 plots convergence diagrams for all three methods for additional 
examples on a 8 4 lattice (/? = 5.6, k = 0.2, p = 1.06) and a 16 4 lattice {(3 — 6.0, 
k = 0.2, p = 1.06) respectively. 

MINRES and CGNE behave very similarly on configuration A. This is to 
be expected, since the hcrmitian indefinite matrix Dh is maximally indefinite, 
so that for an arbitrary right hand side the squaring of the matrix inherent 
in D n should not significantly increase the number of required iterations. 

SUMR always performs best. The savings compared to MINRES and 
CGNE depend on p, [3 and the lattice size and reached up to 50%. 
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Fig. 4. Propagator computation: Convergence of MINRES for (5), CGNE for (19) 
and SUMR for (4). Left plot is for a 8 4 lattice, right plot for a 16 4 lattice. 
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As a side remark, let us mention that the dichotomy between hermitian 
indefinite and non-hermitian positive definite formulations is also a field of 
study in other areas. For example, [3] investigates the effect of multiplying 
some rows of a hermitian matrix with minus one in the context of solving 
augmented systems of the form 

U T o) (») = (o) {-B T o) (y) = (o) • 
5.2 Dynamical simulations 

Let us now turn to a dynamical simulation where we compute the solution 
y* from (13). The methods to be considered are CG for (13), a two-sweep 
SUMR-approach where we solve the two systems 

D\x = b, D u y = x 

using SUMR for both systems (note that is of shifted unitary form, too), 
or a two-sweep MINRES-approach solving the two systems 

D h x = b, D h y = x. 

It is now a bit more complicated to guarantee a comparable accuracy for 
each of the methods. Roughly speaking, we have to run both sweeps in the 
two-sweep methods to the given accuracy. Lemma 8 thus indicates that the 
two-sweep MINRES approach will not be competitive, whereas it does not 
determine which of two-sweep SUMR or CG is to be preferred. Our actual 
numerical experiments indicate that, in practice, CG is superior to two-sweep 
SUMR, see Figure 5. 




multiplications with Q x1() 4 multiplications with Q x 

Fig. 5. Dynamical simulation: Convergence of two-sweep SUMR (dash-dotted) and 
CG (solid) for (13) Left plot is for p = 1.01, right plot for p — 1.1. Both plots are 
for configuration A. 
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6 Conclusion 

We have for the first time applied SUMR as the outer iteration when solving 
a propagator equation (4) for overlap fermions. Our theoretical analysis and 
the numerical results indicate that this new method is superior to CGNE as 
well as to MINRES on the symmetrized equation. In practice, savings tend 
to increase with larger values of p. We achieved savings of about 50% for 
physically interesting configurations. 

We have also shown that when solving the squared equation (13) directly 
as is required in a dynamical simulation, a two sweep approach using SUMR 
is not competitive to directly using CG with the squared operator. This is in 
contrast to the case of the Wilson fermion matrix, where it was shown in [4] 
that the two-sweep approach using BiCGstab performs better than CG. 5 

In the context of the overall inner-outer iteration scheme, additional issues 
arise. In particular, one should answer the question of how accurately the 
result of the inner iteration (evaluating the product of sign(Q) with a vector) 
is really needed. This issue will be addressed in a forthcoming paper, [1]. 

Acknowledgements. G.A. is supported under Li701/4-l (RESH Forscher- 
gruppe FOR 240/4-1). N.C. enjoys support from the EU Research and Train- 
ing Network HPRN-CT-2000-00145 "Hadron Properties from Lattice QCD" . 



A Definitions 

The Wilson-Dirac matrix reads M — I — nDw where 

{D W )nm = ^2(1 - 7m) ® EV(n)$n,m-M + i 1 + 7m) ® Ufa - fi)5 n , m+fl (23) 
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00 
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-i 







10 



The Euclidean 7-matrices in the chiral representation are given as: 



7i 



The hermitian form of the Wilson-Dirac matrix is given by 

Q = 75 M, 

with 75 defined as the product 



75 := 71727374 



1 








1 











-1 











-1 



(24) 



(25) 



5 BiCGstab is not an alternative to SUMR in the case of the overlap operator, 
since, unlike BiCGstab, SUMR is an optimal method achieving minimal residuals 
in the 2-norm. 
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B Massive Overlap Operator 

Following Ncuberger [10], one can write the massive overlap operator as 

D N (p) - c ((1 + p)I + (1 - /i)M(M^M)-i) . (26) 

The normalisation c can be absorbed into the fermion rcnormalisation, and 
will not contribute to any physics. For convenience, we have set c = 1/(1 — p). 
Thus, the regularizing parameter p as defined in eq. (2) is related to p by 

p=(l + p)/{l-p). (27) 

The physical mass of the fermion is then given by 
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